Optimization for latency reduction in Product Quantization #397

AbhijitKulkarni1 · 2025-02-28T08:31:35Z

This PR optimizes io.github.jbellis.jvector.quantization.KMeansPlusPlusClusterer#chooseInitialCentroids using the Java Vector API in two places:

Compute the element-wise minimums with vectorized operations instead of one scalar comparison per loop iteration.
Replace the scalar addition with a vectorized addition.
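
For readers unfamiliar with the two operations, here is a minimal scalar sketch of what they compute. It assumes the usual k-means++ seeding pattern, where each point keeps its distance to the nearest centroid chosen so far and the sum of those distances drives the next weighted pick. The class name, the use of plain float[] arrays, and the sum helper are illustrative assumptions, not the PR's code; only the name minInPlace comes from this PR. The vectorized version would produce the same results while processing several elements per instruction.

```java
import java.util.Arrays;

// Illustrative scalar reference only; not the jvector implementation.
final class SeedingSketch {
    /** Element-wise minimum, written back into a: a[i] = min(a[i], b[i]). */
    static void minInPlace(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        for (int i = 0; i < a.length; i++) {
            a[i] = Math.min(a[i], b[i]);
        }
    }

    /** Sum of all elements (hypothetical helper for the weighted pick). */
    static float sum(float[] a) {
        float total = 0f;
        for (float v : a) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // Distance from each point to its nearest centroid so far.
        float[] nearest = {4.0f, 1.0f, 9.0f, 2.5f};
        // Distance from each point to the newly chosen centroid.
        float[] toNew = {3.0f, 2.0f, 0.5f, 2.5f};

        minInPlace(nearest, toNew);
        System.out.println(Arrays.toString(nearest)); // [3.0, 1.0, 0.5, 2.5]
        System.out.println(sum(nearest));             // 7.0
    }
}
```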

I tested the optimized code with the setup below and observed a latency reduction of ~6% in Product Quantization on the ada002-100k dataset.

Run Setup:
Jvector release used: 4.0.0-beta.1
JDK used:
openjdk version "23.0.1" 2024-10-15
OpenJDK Runtime Environment (build 23.0.1+11-39)
OpenJDK 64-Bit Server VM (build 23.0.1+11-39, mixed mode, sharing)

Machine used: socket 1 of an m7i.metal-48xl instance

To measure the benefit of this optimization in isolation, the following changes were made in Bench.java, PQParameters.java and Grid.java:

In the buildCompression and searchCompression parameters, enabled only the PQ parameters.
Disabled caching for PQParameters.
Disabled the testConfiguration calls so that only quantization and index building are timed. Index building time was almost the same in the baseline and optimized runs:
io/github/jbellis/jvector/example/Grid.java:141 --> /*indexes.forEach((features, index) -> { ...});*/

marianotepper · 2025-02-28T23:02:05Z

Thank you for your contribution! This PR looks really good.

I have a couple of minor comments and suggestions.

To help with code readability, I suggest placing minInPlace at the following locations:

VectorUtil.java -> line 160
VectorUtilSupport.java -> line 83
DefaultVectorUtilSupport.java -> line 289
NativeVectorUtilSupport.java -> line 118
PanamaVectorUtilSupport.java -> line 107
SimdOps.java -> line 614

No need to change the contents of the function or its signature, just its line placement. This way, it is closer to functions that are somewhat related.

In line 218 of KMeansPlusPlusClusterer.java, it seems like zeroing the vector makes no difference as we are replacing every value in the next loop. Is this correct?
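
The reasoning behind this question can be illustrated with a small sketch on a plain array: if the loop that follows assigns every element, the buffer ends up identical whether or not it was zeroed first. Everything here (the class, the squaredDistance helper, and the use of float[] in place of jvector's vector type) is a hypothetical stand-in and does not reproduce the code around line 218.

```java
import java.util.Arrays;

// Hypothetical stand-in; not the code from KMeansPlusPlusClusterer.java.
final class ZeroFillSketch {
    /** Squared Euclidean distance between a point and a centroid. */
    static float squaredDistance(float[] p, float[] c) {
        float d = 0f;
        for (int i = 0; i < p.length; i++) {
            float diff = p[i] - c[i];
            d += diff * diff;
        }
        return d;
    }

    /** Overwrites every slot of out; zeroFirst optionally clears it beforehand. */
    static void recompute(float[] out, float[][] points, float[] centroid, boolean zeroFirst) {
        if (zeroFirst) {
            Arrays.fill(out, 0f); // redundant when the loop below writes every index
        }
        for (int i = 0; i < points.length; i++) {
            out[i] = squaredDistance(points[i], centroid);
        }
    }

    public static void main(String[] args) {
        float[][] points = {{0f, 0f}, {1f, 2f}, {3f, 4f}};
        float[] centroid = {1f, 1f};

        // Both buffers start with stale values from a previous iteration.
        float[] withZeroing = {7f, 7f, 7f};
        float[] withoutZeroing = {7f, 7f, 7f};

        recompute(withZeroing, points, centroid, true);
        recompute(withoutZeroing, points, centroid, false);

        System.out.println(Arrays.equals(withZeroing, withoutZeroing)); // true
    }
}
```

The equivalence only holds when the loop covers every index; a loop that skips or accumulates into elements would still need the reset.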

AbhijitKulkarni1 · 2025-03-01T03:14:46Z

Thanks for your suggestions, I will update accordingly. Yes, you are correct about line 218 of KMeansPlusPlusClusterer.java; I had set the vector to zero for a cleaner implementation. Please let me know if you would prefer that I remove it.

sam-herman · 2025-03-01T03:57:31Z

@marianotepper @AbhijitKulkarni1 if this helps, I added index construction and PQ benchmarks in #398 that you can use to test the results.

Optimization for latency reduction in Product Quantization

5a97b43

marianotepper self-requested a review February 28, 2025 18:24
